Natural language processing system and method



(57) A natural language processing method by which a sequence of natural language
information is analyzed so as to derive the concept represented by the information.
In this method, the input natural language information is processed sequentially,
word by word. At each word, the kind of the subsequent word is expected from the
currently processed word by using knowledge concerning the order of words in the
natural language information. The processing thus eliminates ambiguity in the
information on the basis of this expectation.



FIG. 2 
The present invention relates to a natural language processing system for performing natural language
processing on the basis of information inputted thereto in a natural language and further relates to a method 
therefor. 

Hitherto, a system for processing a predetermined keyword and a system for performing syntactic analysis
and semantic analysis on a text have been devised to process input information represented in a natural
language.

However, among these conventional systems, in the system using predetermined keywords, a very large 
number of keywords are necessary to realize a practical system. 

Further, in the system for performing syntactic analysis and semantic analysis on a text, even when this
system is provided with a grammar and a considerably large dictionary, it is very difficult to uniquely determine
the semantic role of each portion of the analyzed text and to decompose or partition a sequence of nouns into
groups thereof. This is a serious problem in the case where the text is represented in a language, such as
Japanese or modern Hindi, in which a verb is positioned at the end of a sentence. Moreover, in the case of the
conventional system for performing syntactic analysis, it is difficult to process input information if the
information is an incomplete sentence.

Therefore, extraction of useful data from the contents of natural language input information concerning a
specific field, which is an easy task for a human being, cannot be easily achieved by using the conventional
machine.

Accordingly, an object of the present invention is to provide a natural-language processing system which
can process an incomplete sentence as input information, without performing syntactic analysis.

Further, another object of the present invention is to provide a natural-language processing method, by 
which an incomplete sentence can be processed as input information without performing syntactic analysis. 

A first aspect of the present invention aims to provide a natural-language processing system which can 
eliminate ambiguity in information by expecting a kind of subsequent information. 

Furthermore, another aspect of the present invention aims to provide a natural-language processing method,
by which ambiguity in information can be eliminated by expecting a kind of subsequent information.

In accordance with another aspect of the present invention, there is provided a natural language processing
system comprising input means for inputting information represented in a natural language, a knowledge
base for storing linguistic knowledge and general knowledge, partition means for partitioning the information,
which is inputted by the input means, into words, derivation means for referring to knowledge stored in the
knowledge base and for deriving concepts respectively represented by the words obtained by the partition
means, and integration means for relating the concepts of the words, which are derived by the derivation
means, with one another by referring to knowledge stored in the knowledge base.

Further, in accordance with another aspect of the present invention, there is provided a natural language
processing system comprising input means for inputting information represented in a natural language, a
knowledge base for storing therein linguistic knowledge and knowledge concerning a domain of information
to be processed, expectation means for expecting a kind of information to be inputted from the input means,
on the basis of the knowledge stored in the knowledge base, and processing means for processing information
whose kind is expected by the expectation means.

Moreover, in accordance with still another aspect of the present invention, there is provided a natural
language processing system comprising input means for inputting information represented in a natural language,
a knowledge base for storing therein knowledge concerning a domain of information to be processed, general
knowledge and linguistic knowledge, expectation means for expecting a kind of information to be inputted by
the input means, on the basis of the knowledge stored in the knowledge base, expectation information storing
means for storing a result of an expectation made by the expectation means, as expectation information, and
analysis means for analyzing information inputted from the input means by referring to the expectation
information stored in the expectation information storing means and to the knowledge stored in the knowledge base.

Furthermore, in accordance with yet another aspect of the present invention, there is provided a natural
language processing method comprising the input step of inputting information in a natural language, the
partition step of partitioning the information inputted in the input step, the derivation step of referring to knowledge
stored in a knowledge base, which stores linguistic knowledge and general knowledge, and deriving concepts
represented by the words obtained in the partition step, and the integration step of relating the concepts, which
are derived respectively correspondingly to the words in the derivation step, with one another by referring to
the knowledge stored in the knowledge base.

Additionally, in accordance with a further aspect of the present invention, there is provided a natural
language processing method comprising the input step of inputting information in a natural language, the
expectation step of expecting the kind of information inputted in the input step on the basis of knowledge stored in
the knowledge base, which stores therein linguistic knowledge and knowledge concerning the field of
information to be processed, and the step of processing the information whose kind is expected in the expectation
step.

Further, in accordance with a still further aspect of the present invention, there is provided a natural language
processing method comprising the input step of inputting information in a natural language, the expectation
step of expecting the kind of information inputted in the input step on the basis of knowledge stored in the
knowledge base, which stores therein knowledge concerning a domain of information to be processed, general
knowledge and linguistic knowledge, the expectation information storing step of storing information, which
represents a result of an expectation made in the expectation step, in an expectation information memory as
expectation information, and the analysis step of analyzing the information inputted in the input step by referring
to the expectation information stored in the expectation information memory and the knowledge stored in the
knowledge base.

Other features, objects and advantages of the present invention will become apparent from the following 
description of a preferred embodiment with reference to the drawings in which like reference characters des- 
ignate like or corresponding parts throughout several views. 


BRIEF DESCRIPTION OF THE DRAWINGS 



FIG. 1 is a block diagram for illustrating the configuration of the hardware of a natural language processing 
system, embodying the present invention, namely, the embodiment of the present invention; 
FIG. 2 is a functional block diagram for illustrating the fundamental configuration of the natural language
processing system embodying the present invention;

FIG. 3 is a diagram for illustrating the detailed configuration of a knowledge base; 
FIG. 4 is a diagram for illustrating the detailed configuration of a conceptual analyzer;
FIG. 5 is a flowchart for illustrating an analysis process;
FIG. 6 is a detailed flowchart for illustrating a process of obtaining the meaning/concept of a word;

FIGS. 7a and 7b are block diagrams which illustrate the configurations of English Linguistic Knowledge 
Base (LKB) and Japanese LKB, respectively; 

FIG. 8 is a diagram for illustrating the structure of a form dictionary; 
FIG. 9 is a diagram for illustrating the structure of a grammar dictionary; 
FIG. 10 is a diagram for illustrating the structure of a conceptual dictionary;

FIG. 11 is a diagram for illustrating the structure of "Public Document"; 

FIG. 12 is a flowchart for illustrating a process of searching for a concept corresponding to a word; 
FIG. 13 is a diagram for illustrating an example of the process of searching a concept corresponding to a 
word; 

FIG. 14 is a diagram for illustrating a database of general knowledge;

FIG. 15 is a diagram for illustrating an example of the knowledge structure of a physical object;

FIG. 16 is a diagram for illustrating primary structures classed as subtypes of the physical object; 

FIG. 17 is a diagram for illustrating the knowledge structure of a person; 

FIG. 18 is a diagram for illustrating an example of the knowledge structure of the person; 
FIG. 19 is a diagram for illustrating an example of a draft instance of person;

FIG. 20 is a diagram for illustrating the knowledge structure of an abstract entity; 

FIG. 21 is a diagram for illustrating primary structures classed as subtypes of the abstract entity; 

FIG. 22 is a diagram for illustrating the knowledge structure of an organization; 

FIG. 23 is a diagram for illustrating an example of the knowledge structure of the organization; 
FIG. 24 is a diagram for illustrating the knowledge structure of a knowledge domain and an example thereof;

FIG. 25 is a diagram for illustrating primary structures classed as subtypes of space; 

FIG. 26 is a diagram for illustrating the knowledge structures of a place and a country and an example thereof;

FIG. 27 is a diagram for illustrating primary structures classed as subtypes of a unit time;

FIG. 28 is a diagram for illustrating the knowledge structures of the unit time and a time; 

FIG. 29 is a diagram for illustrating the knowledge structure of an action; 

FIG. 30 is a diagram for illustrating primary structures classed as subtypes of the ACTION; 

FIG. 31 is a diagram for illustrating the knowledge structure of MEET; 
FIG. 32 is a diagram for illustrating the draft instance of the MEET;

FIG. 33 is a diagram for illustrating the knowledge structure of PTRANS; 

FIG. 34a is a diagram for illustrating the draft instance of the PTRANS; 

FIG. 34b is a diagram for illustrating the draft instance of PTRANS for the verb "come";
FIG. 35 is a diagram for illustrating the knowledge structure of MTRANS;
FIG. 36 is a diagram for illustrating the knowledge structure of MSENSE; 
FIG. 37 is a diagram for illustrating the knowledge structure of AGREEMENT; 

FIG. 38 is a flowchart for illustrating a processing to be performed by a post-conceptual-analyzer
(hereunder abbreviated as post-CA);

FIGS. 39a, 39b and 39c are diagrams for illustrating objects and results of a processing to be performed 
by the post-CA; 

FIGS. 40a to 40i, 41a to 41h, 42, 43 and 44 are practical examples of input information, information on
processes and results of a processing by the CA;
FIG. 45 is a diagram for illustrating elements which belong to the place;

FIG. 46 is a diagram for illustrating the knowledge representation of the country; 

FIG. 47 is a diagram for illustrating the general knowledge structure of the place; 

FIG. 48 is a diagram for illustrating the knowledge structure of the city; 

FIG. 49 is a diagram for illustrating the representation of YOKOHAMA city; 
FIG. 50 is a diagram for illustrating the general knowledge structure of an address;

FIG. 51 is a diagram for illustrating the detailed knowledge representation of the places in JAPAN; 

FIG. 52 is a diagram for illustrating the knowledge structure of the person; 

FIG. 53 is a diagram for illustrating the knowledge structure of the action; 

FIG. 54 is a diagram for illustrating the knowledge structure of a family register; 
FIG. 55 is a diagram for illustrating the knowledge structure of a page of the family register;

FIG. 56 is a diagram for illustrating the knowledge structure of a block of the family register; 

FIG. 57 is a diagram for illustrating an example of a general dictionary; 

FIG. 58 is a diagram for illustrating an example of a dictionary of the field of the family register; 
FIG. 59 is a diagram for illustrating an example of general description rules;
FIG. 60 is a diagram for illustrating example rules for description in the field of the family register;

FIG. 61 is a diagram for illustrating an example of read/input family register information; 
FIG. 62 is a diagram for illustrating the structure of the initialized family register information; and 
FIG. 63 is a diagram for illustrating the structure of analyzed family register information. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Hereinafter, the preferred embodiment of the present invention will be described in detail by referring to 
the accompanying drawings. 

FIG. 1 is a block diagram for illustrating the configuration of the hardware of a natural language processing
system embodying the present invention, namely, the embodiment of the present invention.

In this figure, reference numeral 1 designates an input unit for inputting information. In the following
description, this input information will be referred to as a text. This system is, however, also able to process
grammatically incomplete or erroneous sentences. Reference numeral 2 denotes a central processing unit
(CPU) which performs operations for various kinds of processing, makes logical decisions and the like,
and further controls each component connected to a bus 6. Further, reference numeral 3 designates
an output unit for outputting information.

Reference numeral 4 denotes a program memory, namely, a memory for storing therein programs to be
executed by the CPU 2 so as to perform control operations including a procedure to be described later by
referring to a flowchart thereof. Either a read-only memory (ROM) or a random access memory (RAM), to
which a program is loaded from an external memory unit or the like, may be employed as the program memory 4.

Reference numeral 5 designates a data memory which stores knowledge contained in a knowledge base
(to be described later) in addition to data generated when performing various kinds of processing. The data
memory is, for example, a RAM. Knowledge contained in the knowledge base is loaded from a nonvolatile
external storage medium into the data memory 5 before the processing is performed. Alternatively, the knowledge
contained in the knowledge base is referred to each time such knowledge becomes necessary.

Reference numeral 6 denotes a bus used to transfer an address signal indicating a component to
be controlled by the CPU 2, a control signal used for controlling each component, and data to be
exchanged between the components.

FIG. 2 is a functional block diagram for illustrating the fundamental configuration of the natural language
processing system embodying the present invention.

The input unit 1 of this figure is a unit used to input information represented in a natural language. For
example, a keyboard for keying characters, a speech or voice recognition device for inputting and recognizing
speech sounds, a character recognition device for optically reading characters from a document and
recognizing the characters, a receiver for receiving information from other systems, or the like may be employed
as the input unit 1. Further, information generated by performing another operation in this processing system
may also be used as input information thereto. Moreover, two or more of these devices may be simultaneously
employed as the input unit, and one of the employed devices may be selected in such a manner as to be
currently used as the input unit.

The conceptual analyzer (CA) 21 is operative to refer to the knowledge contained in the knowledge base 22
and to extract a concept described in a natural language text which is inputted thereto by the input unit 1.
The details of the CA 21 will be described later.

An output unit 3 is used to output data obtained from the CA 21. For example, a speech synthesis device
for synthesizing speech sounds from character information, a display device such as a cathode-ray-tube (CRT)
display and a liquid crystal display, a printer for providing printouts, or a transmitter for transmitting information
to other systems may be employed as the output unit 3.

Further, an output of the output unit may be used as input information to another portion of this system.
Alternatively, two or more of these devices may be simultaneously employed as the output unit, and one of
the employed units may be selected in such a manner as to be currently used as the output unit.

FIG. 3 is a diagram for illustrating the detailed contents of a knowledge base 22. Reference numeral 221
designates a world knowledge base (WKB) which has general knowledge such as knowledge concerning the
structure of a "PLACE"; 222 a domain knowledge base (DKB) which has knowledge particular to a field of an
object to be processed; and 223 an LKB which has linguistic information such as knowledge concerning parts
of speech and a grammar.

The aforementioned CA 21 analyzes natural language information inputted by a user and converts this
information to concepts which represent the meaning of this information. The CA 21 obtains a semantic concept
equivalent to input information therefrom and performs a processing on the basis of the semantic concept
instead of first performing a syntactic analysis on the inputted natural language information and then giving
meaning to the analyzed structure of a sentence as in the case of a conventional sentence analyzer. Further,
the CA 21 performs a processing by expecting subsequent information from preceding information contextually,
semantically and grammatically.

In the CA 21, when processing input information, the emphasis is put on the meaning thereof instead of
generating the construction of a sentence by performing a syntactic analysis or parsing. Thereby, this system
can handle even a grammatically erroneous statement and an incomplete sentence such as a piece of a
statement. The CA 21 expects the subsequent word in the process of processing each word so as to make good
use of the expected subsequent word in the subsequent process.

In the case of a conceptual analysis approach, a dictionary of the meaning of words plays a central role
in the processing. Thus, the CA 21 does not utilize any definite grammar. The CA 21, however, does not
completely dispense with utilization of constructive characteristics. Namely, the CA 21 utilizes a noun group
constituency when a term having a meaning is constituted by a plurality of words.

FIG. 4 is a diagram for illustrating the detailed configuration of the CA 21. Here, a pre-CA 211 is operative
to perform a pre-processing which will be described later by referring to FIG. 5. A main CA 212 is operative to
perform a main processing (to be described later by referring to FIGS. 5 and 6) on each input word.
Furthermore, a post-CA 213 is operative to perform a post-processing which will be described later by referring to FIGS.
38 and 39.

The CA 21 utilizes an expectation list, a word list, a C-list and a request list so as to effectively process
input information.

The expectation list, referred to hereafter as "ELIST", is a list in which expectations concerning the next
concept/word are maintained.

The word list, referred to hereafter as "WLIST", is a list in which the meaning of all words, a REQUEST
representing the mutual relation between preceding and subsequent words, and word information containing
an instance are maintained.

The C-list is a list in which word-related linguistic concepts containing both linguistic and conceptual
information about words are maintained.

The request list, referred to hereafter as "RLIST", is a list in which requests which are active for each concept
are maintained.
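
The four working lists can be pictured as a small container of plain collections. The following is only a minimal sketch under stated assumptions: the class names `WordInfo` and `AnalyzerLists`, the field types, and the sample entries are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WordInfo:
    """One WLIST entry (hypothetical layout): meaning, REQUEST and instance."""
    word: str
    meaning: str
    request: Optional[str] = None                  # relation to neighbouring words
    instance: dict = field(default_factory=dict)   # draft instance, initially empty


@dataclass
class AnalyzerLists:
    """Working lists used by the CA (types are assumptions for illustration)."""
    elist: list = field(default_factory=list)  # ELIST: expectations for the next concept/word
    wlist: list = field(default_factory=list)  # WLIST: WordInfo entries
    clist: list = field(default_factory=list)  # C-list: linguistic concepts of words
    rlist: dict = field(default_factory=dict)  # RLIST: active requests per concept


lists = AnalyzerLists()
lists.elist.append("ACTOR")
lists.wlist.append(WordInfo(word="book", meaning="PUBLIC-DOCUMENT"))
lists.rlist.setdefault("PUBLIC-DOCUMENT", []).append("hypothetical request")
```

Keeping RLIST as a mapping from a concept to its requests mirrors the statement that requests are active "for each concept"; a flat list would serve equally well in a sketch of this size.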

FIG. 5 is a flowchart for illustrating a procedure by which the CA 21 performs the processing.
A process consisting of steps S501 and S502 is performed by the pre-CA 211. Further, another process
consisting of steps S503 to S512 is performed by the main CA 212.

First, in step S501, initial expectation information is determined about the next input word. This is a primary
processing to be performed by the pre-CA 211 on the basis of the grammar of the language, the meaning of
words and the context (for instance, an answer presumed from a question in a colloquy).

For example, in the case where input information is represented in English and in the active voice, the
structure of a sentence is grammatically restricted as follows:
Subject - Verb - Object.

The structure of the sentence is further restricted owing to the meaning thereof as follows:
Subject (Actor or Entity) - Verb (Action or State Descriptor) - Object
where an actor may be a person, an organization or the like. The knowledge structures of a person and an
action are shown in FIGS. 17 and 29, respectively. Further, the expectation can be set or modified on the basis
of the context.

In step S502, the draft instance of the expected concept of the sentence is created. As described above,
an input to the CA 21 is a set of natural language words. The CA 21 performs the processing (to be described
later) on every word inputted thereto.

In step S503, it is judged whether an unprocessed next word exists. If not, the processing is terminated.
If a next word exists, the concept corresponding to that word is obtained in step S504. Then, in step S505,
it is checked whether this concept is a verb. If so, the concept of the sentence is modified and the expectation
about the next word is selected in step S506. In step S512, the expected information is updated and the program
returns to step S503. If it is found in step S505 that the obtained concept is not a verb, it is further checked in
step S507 whether or not the obtained concept is a casemarker, such as a particle, for prescribing the case
structure or a delimiter for separating individual items. If the obtained concept is a casemarker or a delimiter,
this casemarker or delimiter is attached before or after the concept in step S508. Then, the program returns
to step S503 via step S512 as before. If it is found in step S507 that the obtained concept is neither a casemarker
nor a delimiter, it is further checked in step S509 whether or not the obtained concept is a qualifier. If so, the
concept waits until it finds the appropriate concept (noun/verb) to attach itself to, and is then attached to
that concept in step S510. Subsequently, the program returns to step S503 via step S512. If
it is found in step S509 that the obtained concept is not a qualifier, this concept is filled into the draft concept
of the statement. Then, the expected information is updated in step S512 and the program returns to step S503
to process any remaining words.
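
The branch order of FIG. 5 can be sketched as a simple loop. This is a minimal illustration and not the embodiment's implementation: the `Concept` class, the `kind` labels, the rule that a casemarker attaches to the preceding concept, and the reduction of the expectation to a single string are all assumptions made for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    kind: str  # assumed labels: "verb", "casemarker", "delimiter", "qualifier", "other"
    markers: list = field(default_factory=list)
    qualifiers: list = field(default_factory=list)


def analyze(concepts):
    """Process concepts word by word in the branch order of steps S503 to S512."""
    sentence = {"action": None, "fillers": []}   # draft concept of the sentence
    expectation = "actor"                        # initial expectation (assumed)
    waiting = []                                 # qualifiers waiting for a noun/verb
    previous = None
    for concept in concepts:                     # S503/S504: next word and its concept
        if concept.kind == "verb":               # S505 -> S506
            concept.qualifiers.extend(waiting)
            waiting.clear()
            sentence["action"] = concept
            expectation = "object"               # simplified expectation update
            previous = concept
        elif concept.kind in ("casemarker", "delimiter"):  # S507 -> S508
            if previous is not None:             # simplification: attach after only
                previous.markers.append(concept.name)
        elif concept.kind == "qualifier":        # S509 -> S510 (deferred attachment)
            waiting.append(concept.name)
        else:                                    # fill into the draft concept
            concept.qualifiers.extend(waiting)
            waiting.clear()
            sentence["fillers"].append(concept)
            previous = concept
    return sentence, expectation


words = [Concept("big", "qualifier"), Concept("dog", "other"),
         Concept("eats", "verb"), Concept("meat", "other")]
sentence, expectation = analyze(words)
```

In this sketch the qualifier "big" waits in a buffer and is attached to the next noun or verb, which corresponds to the waiting behaviour described for step S510.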

FIG. 6 is a detailed flowchart for illustrating the processing to be performed in step S504, namely, the
process of obtaining the meaning/concept of the word.

First, in step S601, the next unprocessed word is obtained. Incidentally, in the case where words are not
explicitly separated in a statement as in the case of Japanese, it becomes necessary to partition an input
statement in word units. This is, however, unnecessary in the case where there is a space between words in a
statement as in the case of English. Further, this partitioning process is performed as a part of this step or prior to
it.

Next, in step S602, the LKB 223 is searched. If there exists a word identical to the word to be processed,
the program advances to step S608. However, if the word is not found in the LKB 223, the program advances
to step S603, whereupon the ending of the word to be processed (-ed, -ing, -s or the like in the case of English)
is deleted according to wordform rules. Then, the LKB is searched for the same word as the obtained word
(which is called a "root"). If that word exists, the program advances to step S608. Conversely, in the case where
the word is not found in the LKB, the program advances to step S605, whereupon the spelling of the word is
checked by a word corrector (not shown) and the user's misspelling is corrected. Thereafter, the program
returns to step S602, whereupon the LKB 223 is searched again for the word obtained by the correction. However,
if the same word is not found, or if there is no possible spelling error found in step S605, the program advances
to step S607. In this step S607, the WKB 221 is searched for the same word. If the word is found therein, the
program advances to step S608. However, if it is not found, the program advances to step S609.

In step S608, a draft instance is created for the concept corresponding to the word being processed, from
a given word rule. Thus the program is finished. The draft instance is the instance of the knowledge structure.
Further, all of the slots of this structure are filled with empty instances of entities which can possibly fill the slots.
In step S609, an "unknown concept" is created. Then, the program is finished.
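
The fallback order of FIG. 6 (exact match, root after removing an ending, spelling correction, world knowledge, unknown concept) can be sketched as follows. The dictionary contents, the list of endings, and the one-letter-substitution corrector are hypothetical stand-ins chosen only to make the sketch runnable; the word corrector of the embodiment is not described in this section.

```python
# Hypothetical miniature knowledge bases (entries are illustrative only).
LKB = {"book": "PUBLIC-DOCUMENT", "come": "PTRANS", "meet": "MEET"}
WKB = {"yokohama": "CITY"}
ENDINGS = ("ing", "ed", "s")  # assumed wordform rules


def strip_ending(word):
    """Return the root obtained by deleting a known ending, or None."""
    for ending in ENDINGS:
        if word.endswith(ending) and len(word) > len(ending):
            return word[: -len(ending)]
    return None


def correct_spelling(word):
    """Stand-in corrector: an LKB word differing by exactly one substituted letter."""
    for known in LKB:
        if len(known) == len(word) and sum(a != b for a, b in zip(known, word)) == 1:
            return known
    return None


def find_concept(word):
    word = word.lower()
    if word in LKB:                       # S602: search the LKB
        return LKB[word]
    root = strip_ending(word)             # S603: delete the ending, search for the root
    if root in LKB:
        return LKB[root]
    corrected = correct_spelling(word)    # S605: correct spelling, then search the LKB again
    if corrected is not None:
        return LKB[corrected]
    if word in WKB:                       # S607: search the WKB
        return WKB[word]
    return "UNKNOWN-CONCEPT"              # S609: create an unknown concept


examples = [find_concept(w) for w in ("book", "books", "meeting", "cime", "Yokohama", "xyzzy")]
```

Because the stand-in corrector only ever returns a word that is already in the LKB, the second LKB search is folded into a direct lookup here; a real corrector would return to the search of step S602.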

FIGS. 7a and 7b are block diagrams which illustrate the configurations of English Linguistic Knowledge
Base (LKB) and Japanese LKB, respectively.

Reference characters 71a and 71b are form dictionaries which store therein groups of words represented in the various
forms appearing in a natural language text, respectively corresponding to the languages. Further,
reference characters 72a and 72b are word-tag dictionaries in which one word-tag corresponds to one meaning
of each word. Reference characters 73a and 73b are grammar dictionaries in which grammatical information
concerning each word is described. Reference characters 74a and 74b are conceptual dictionaries in which
the concept(s) corresponding to each word is described. Reference characters 75a and 75b are semantic
dictionaries in which the meaning of each word as given in that language is described. Each word-tag is connected
to one or more of the words stored in the corresponding form dictionary 71a or 71b, has its grammatical
information described in the corresponding grammar dictionary 73a or 73b, has its corresponding concept(s)
described in the corresponding conceptual dictionary 74a or 74b and has its meaning described in the
corresponding semantic dictionary 75a or 75b.

In this dictionary, different word-tags are assigned to different meanings of one word. Thus, a single
meaning is made to correspond to each word-tag. For instance, the English word "book" has two meanings; therefore,
two word-tags correspond to the word "book". One of the word-tags corresponds to a noun which means sheets
of paper fastened together, on which matters are described; the other word-tag corresponds to a verb whose
meaning is to reserve something.

FIG. 8 is a diagram for illustrating the structure of an example of the form dictionary 71a. The form
dictionary is used to establish the relation among the words, the word-tags and the grammatical forms. In the
case of a verb, the possible grammatical forms are the PASTFORM (namely, the past form), the PRESFORM
(namely, the present form), the PROGFORM (namely, the progressive form) etc. Further, in the case of a noun,
the SINGULAR (namely, the singular form) and the PLURAL (namely, the plural form) forms may also exist.
In the case of a pronoun, the SUBJECTIVE (namely, the subjective form) and the OBJECTIVE (namely, the
objective form) etc. exist. These are only some examples; other categories may be provided. For instance, in
the case of a noun, the classification may be made according to whether or not the noun is countable, what
gender the noun has, whether the noun is a common noun or a proper noun, and so forth. Furthermore, this
dictionary also contains other information such as whether a word is of the British style or of the American
style, whether a word is in colloquial style or in literary style, and so on.

FIG. 9 is a diagram for illustrating an example of the grammar dictionary 73. This dictionary contains
syntactic information concerning a word, and information concerning the position of a word in a natural language
sentence. For example, regarding a verb, a large amount of position information corresponding to the cases
of the active and passive voices or to the point of focus is stored therein. Further, regarding a noun, the structure
of a noun phrase is determined according to the point of focus and the important points thereof. The position
information is used to specify the order in which necessary attributes of the concept of a word should be
described. For instance, the position information concerning a verb "go" is "Actor", "wform" and "* lobj_direc".
Thus, the slots appear only in this order: the "Actor" (specified by the information "Actor"), the "Verb"
(whose form is specified by the information "wform") and the destination (specified by the information
"lobj_direc"). The mark "*" represents a preposition or a
casemarker. This slot indicates that a preposition is employed. The exact preposition is determined as per the
rule applicable under the circumstances.

In the case of the form of a noun, a certain noun, for example, "discussion" implies an action and requires
the information concerning a slot of a verb derived from this word, namely, "discuss". Further, the word-tag
for the corresponding verb is stored at an entry point.

FIG. 10 is a diagram for illustrating an example of the conceptual dictionary 74. The conceptual dictionary
74 represents a mapping from a word-tag to a concept. Each word-tag corresponds to an associated concept.
As many word-tags may be mapped to the same concept, this mapping is an n-to-one mapping (where "n" is
one or more). Thus, a unique inverse mapping requires a rule or a condition. Such a rule
provides information which sets conditions on the filling of slots of the knowledge structure of the concept in such
a manner as to represent specific word-tags.

Further, the rules are arranged in the order of specific to general ones. For example, a rule corresponding 
to the word "come" is more specific than a rule corresponding to the word "go". Thus the former rule is ranked 
higher than the latter rule. 

The CA 21 accesses this dictionary by using a word-tag, next extracts the rule corresponding to this
word-tag, and further uses this rule so as to generate a request.
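
The n-to-one mapping and its rule-based inverse can be sketched as follows. This is only an illustrative sketch, not the dictionary format of FIG. 10: the word-tags "come1" and "go1", the slot name "direction" and the condition attached to "come" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Slots = Dict[str, str]


@dataclass
class Rule:
    """A condition on filled slots that selects one specific word-tag."""
    word_tag: str
    condition: Callable[[Slots], bool]


# Forward mapping: several word-tags may share one concept (n-to-one).
# "come1" and "go1" are hypothetical word-tags.
WORD_TAG_TO_CONCEPT: Dict[str, str] = {
    "come1": "PTRANS",
    "go1": "PTRANS",
}

# Inverse mapping: rules per concept, ordered from specific to general.
# The condition for "come" (movement toward the speaker) is an assumption.
CONCEPT_RULES: Dict[str, List[Rule]] = {
    "PTRANS": [
        Rule("come1", lambda slots: slots.get("direction") == "speaker-location"),
        Rule("go1", lambda slots: True),  # most general rule, tried last
    ],
}


def concept_for(word_tag: str) -> Optional[str]:
    """Map a word-tag to its concept."""
    return WORD_TAG_TO_CONCEPT.get(word_tag)


def word_tag_for(concept: str, slots: Slots) -> Optional[str]:
    """Return the word-tag of the first (most specific) rule that holds."""
    for rule in CONCEPT_RULES.get(concept, []):
        if rule.condition(slots):
            return rule.word_tag
    return None
```

Because the rules are tried in order, the specific rule for "come" takes precedence over the general rule for "go" whenever its condition on the slots holds.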

FIG. 11 is a diagram for illustrating the structure of "Public Document". 

FIG. 12 is a flowchart for illustrating the process of searching for the concept(s) corresponding to an input 
word. 

FIG. 13 illustrates this same process for the input word "book".

In step S131, the word "book" is inputted to the system. In step S132, the form dictionary 71a (see FIG.
8) is searched for the inputted word "book". As shown in FIG. 8, there are two corresponding word-tags, "book1"
and "book2". Therefore, the concepts "PUBLIC-DOCUMENT" and "AGREEMENT" corresponding to the two
word-tags, respectively, are extracted from the conceptual dictionary in step S133. Then, the conditions
corresponding to each of these concepts are extracted from the dictionary (FIG. 10) and the knowledge structures,
which are shown in FIGS. 11 and 37, respectively, in step S134. Subsequently, the concepts and the conditions
corresponding thereto are outputted in step S135. Further, in the case where no words are found in step S132,
such a fact is displayed in step S136.
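
The search of steps S131 to S136 can be sketched as follows. This is a minimal sketch under stated assumptions: the dictionaries are reduced to plain Python mappings, and the condition strings are placeholders, since the actual conditions of FIGS. 10, 11 and 37 are not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

# Form dictionary: surface word -> word-tags (cf. FIG. 8).
FORM_DICTIONARY: Dict[str, List[str]] = {"book": ["book1", "book2"]}

# Conceptual dictionary: word-tag -> concept.
CONCEPTUAL_DICTIONARY: Dict[str, str] = {
    "book1": "PUBLIC-DOCUMENT",
    "book2": "AGREEMENT",
}

# Placeholder conditions per concept (hypothetical content).
CONDITIONS: Dict[str, List[str]] = {
    "PUBLIC-DOCUMENT": ["<conditions of PUBLIC-DOCUMENT>"],
    "AGREEMENT": ["<conditions of AGREEMENT>"],
}


def search_concepts(word: str) -> Optional[List[Tuple[str, List[str]]]]:
    """Return (concept, conditions) pairs for a word, or None if unknown."""
    word_tags = FORM_DICTIONARY.get(word)          # step S132
    if not word_tags:
        return None                                # step S136: caller reports it
    results = []
    for tag in word_tags:
        concept = CONCEPTUAL_DICTIONARY[tag]       # step S133
        conditions = CONDITIONS.get(concept, [])   # step S134
        results.append((concept, conditions))
    return results                                 # step S135
```

For the input "book" the function returns both candidate concepts; choosing between them is left to the later, expectation-based stage of the analysis.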
FIG. 14 is a diagram for illustrating a database of general knowledge.

FIG. 15 is a diagram for illustrating the knowledge structure of a physical object. FIG. 16 is a diagram for
illustrating primary structures classed as subtypes of the physical object.

FIG. 17 is a diagram for illustrating the knowledge structure of a person. FIG. 18 is a diagram for illustrating
an example thereof. FIG. 19 is a diagram for illustrating an example of the draft instance of a person.

FIG. 20 is a diagram for illustrating the knowledge structure of an abstract entity. FIG. 21 is a diagram for
illustrating primary structures classed as subtypes of the abstract entity.

FIG. 22 is a diagram for illustrating the knowledge structure of an organization. FIG. 23 is a diagram for
illustrating an example of the organization.

FIG. 24 is a diagram for illustrating the knowledge structure of a knowledge domain and an example thereof.

FIG. 25 is a diagram for illustrating primary structures classed as subtypes of space. FIG. 26 is a diagram 
for illustrating the knowledge structures of a place and a country and an example thereof. 

FIG. 27 is a diagram for illustrating primary structures classed as subtypes of unit time. FIG. 28 is a
diagram for illustrating the knowledge structures of the unit time and time.

FIG. 29 is a diagram for illustrating the knowledge structure of an action. FIG. 30 is a diagram for illustrating 
primary structures classed as subtypes of the action. 

FIG. 31 is a diagram for illustrating the knowledge structure of MEET. FIG. 32 is a diagram for illustrating 
the draft instance of MEET. 

FIG. 33 is a diagram for illustrating the knowledge structure of PTRANS. FIG. 34a is a diagram for
illustrating the draft instance of PTRANS. FIG. 34b is a diagram for illustrating the draft instance of PTRANS for
the verb "come".

FIG. 35 is a diagram for illustrating the knowledge structure of MTRANS. FIG. 36 is a diagram for
illustrating the knowledge structure of MSENSE. FIG. 37 is a diagram for illustrating the knowledge structure of
AGREEMENT.

FIG. 38 is a flowchart for illustrating the processing to be performed by the post-CA 213. Further, FIGS. 39a,
39b and 39c are diagrams for illustrating objects and results of the processing to be performed by the post-CA
213.

First, in step S381 of FIG. 38, a beautification concept is identified. For instance, in the case where the
input information represents "I would like to meet you", the result of the processing performed by the main CA 212
is illustrated as OUTPUT in FIG. 39a. In this case, the outer AD "WANT" is identified as the beautification
concept in step S381.

Next, in step S382, associated statements are connected with each other. For instance, in the case where
the statements (1) and (2) of FIG. 39b are inputted, the result of the processing performed by the main CA 212
consists of the two ADs shown as OUTPUT in FIG. 39b. In this case, the two ADs are connected to each other as
cause and effect as a result of step S382, as illustrated in this figure by OUTPUT-P.

Further, in step S383, the unreasonable connections between ADs are severed. For instance, in the case
where the input information represents "reserve accommodation and pick up", the result of the processing
performed by the main CA 212 is shown as OUTPUT in FIG. 39c. Here, the concept "MEET" (corresponding to
"pick up") is connected to the concept "AGREEMENT" by an AND relation (incidentally, this means "and"). This
connection is severed in this step so as to yield two independent ADs, as shown in OUTPUT-P.
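
Steps S381 and S383 can be sketched on a toy representation of an AD. The nested-dictionary layout and the keys "type", "object" and "parts" are assumptions made for this sketch and are not the notation of FIGS. 39a to 39c. The sketch also splits every top-level AND relation, whereas the text leaves open how an unreasonable connection is recognized.

```python
from typing import Any, Dict, List, Tuple

AD = Dict[str, Any]

# Concepts treated as beautification wrappers (only WANT is named in the text).
BEAUTIFICATION_CONCEPTS = {"WANT"}


def identify_beautification(ad: AD) -> Tuple[bool, AD]:
    """Step S381: report whether the outer AD is a beautification concept.

    Returns (found, core), where core is the wrapped AD if one was found
    and the unchanged input otherwise.
    """
    if ad.get("type") in BEAUTIFICATION_CONCEPTS and "object" in ad:
        return True, ad["object"]
    return False, ad


def sever_and(ad: AD) -> List[AD]:
    """Step S383 (simplified): split a top-level AND into independent ADs."""
    if ad.get("type") == "AND":
        return list(ad["parts"])
    return [ad]
```

For the input "reserve accommodation and pick up", an AND relation over AGREEMENT and MEET would be split by sever_and into two independent ADs.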

Hereinafter, a practical example of the procedure will be described. First, it is assumed that the information
shown in FIG. 40a is inputted. The processing is performed according to the flowcharts of FIGS. 5 and 6.
First, in the pre-CA 211, ELIST, WLIST, CLIST and RLIST are initialized. The information of FIG. 40b is
set as the initial expectation at the start of the processing.

In the main CA 212, the word "John" is the first word to be processed. When the LKB is searched, the
meaning of the word "John" is found as illustrated in FIG. 40c. The information of FIG. 40c indicates that the
word "John" is the name of a person and that the part of speech thereof is a noun. Consequently, the draft instance
C1 given in FIG. 40d is created. Further, the linguistic concept Lc1 (as illustrated in FIG. 40e) corresponding
to C1 is also created. Thus, the expectation that the word represents a subject is satisfied. Therefore, the next
expectation is set for a verb.

Then, "sent" is found to be the next word to be processed. The searching of the LKB reveals that this is
the past form of the verb "send", and the corresponding information from the LKB is given in FIG. 40f. The
constraints for "send", as specified by the LKB, and the constraints inherent to the knowledge structure of the
PTRANS are merged together to create the draft instance C2 given in FIG. 40g. Furthermore, the linguistic
concept Lc2 corresponding to C2 is created (given in FIG. 40h). Then, the position information (see FIG. 40i),
according to which a sentence using this verb may be formed, is read from the LKB. The subject of the verb,
namely, the concept C1 (namely, a person), is filled in the actor slot of C2. The next expectation is set to be
for a grammatical object which can be the OBJECT (namely, a physical or abstract entity) of C2.

The next word to be processed is found to be "book". When searching the LKB, two meanings (namely,
AGREEMENT and PUBLIC-DOCUMENT) given in FIG. 41a are obtained. However, as the expectation at this
stage is for an OBJECT (namely, a physical object or abstract entity), the AGREEMENT meaning is not
suitable. Thus, only the PUBLIC-DOCUMENT meaning is applicable. Consequently, the ambiguity of the word
"book" is eliminated. The draft instance C3 and the corresponding linguistic concept Lc3 of FIG. 41b are created
similarly as in the aforementioned case. As the draft instance C3 does not correspond to a person, only the
Position Info1 of the position information given in FIG. 40i is applicable and the draft instance C3 is inferred to
be the object of the PTRANS (i.e., C2). From the Position Info1, the next expectation is set to be for a person
corresponding to the "Iobj-benf" of the PTRANS, which is expected to be preceded by a casemarker (namely, "to").
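
The way an expectation removes the ambiguity of "book" can be sketched as follows. This is a simplified sketch, not the CA of this embodiment: the lexicon entries, the slot names and the fixed list of expectations are assumptions standing in for the LKB and the position information of FIG. 40i, and the OBJECT expectation is narrowed to the single concept PUBLIC-DOCUMENT. The sketch runs over the whole example sentence, including the words "to" and "Mary" that the walk-through reaches next.

```python
from typing import Dict, List, Set, Tuple

Sense = Tuple[str, str]  # (concept type, word-tag or name)

# Hypothetical lexicon: each word lists every sense it may have.
LEXICON: Dict[str, List[Sense]] = {
    "John": [("PERSON", "John")],
    "sent": [("PTRANS", "send")],
    "book": [("PUBLIC-DOCUMENT", "book1"), ("AGREEMENT", "book2")],
    "to": [("CASEMARKER", "to")],
    "Mary": [("PERSON", "Mary")],
}

# Expected slot at each position and the concept types that satisfy it.
# A fixed order is a simplification of the position information.
EXPECTATIONS: List[Tuple[str, Set[str]]] = [
    ("actor", {"PERSON"}),
    ("verb", {"PTRANS"}),
    ("object", {"PUBLIC-DOCUMENT"}),
    ("casemarker", {"CASEMARKER"}),
    ("iobj_benf", {"PERSON"}),
]


def analyze(words: List[str]) -> Dict[str, Sense]:
    """Fill slots word by word, keeping only senses that meet the expectation."""
    frame: Dict[str, Sense] = {}
    for position, word in enumerate(words):
        if position >= len(EXPECTATIONS):
            raise ValueError(f"no expectation left for {word!r}")
        slot, accepted = EXPECTATIONS[position]
        matches = [sense for sense in LEXICON.get(word, []) if sense[0] in accepted]
        if len(matches) != 1:
            raise ValueError(f"{word!r} does not satisfy the expectation {slot!r}")
        if slot != "casemarker":  # the casemarker only confirms the next slot
            frame[slot] = matches[0]
    return frame
```

Calling analyze on the words of "John sent book to Mary" keeps only the "book1" sense for the object slot, because the AGREEMENT sense does not satisfy the expectation in force at that position.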

Then, "to" is found to be the next word. Its meaning given in FIG. 41c is found from the LKB and,
correspondingly, the draft instance C4 of FIG. 41d and the linguistic concept Lc4 of FIG. 41e are created similarly
as in the aforesaid cases. As the result of finding this word "to", the revised expectation is set for "Iobj-benf
(Person)", "Iobj-benf (Place)" or "Iobj-benf (Action)".

Then, "Mary" is found to be the next word. Its meaning given in FIG. 41f is found from the LKB, and the draft
instance C5 of FIG. 41g and the linguistic concept Lc5 of FIG. 41h are created similarly as in the aforesaid
cases. Moreover, because the request for the casemarker is satisfied, the draft instance C5 is attached as the
"Iobj-benf" of the PTRANS (i.e., C2). The final result is shown in FIG. 42.

Hereunder, another example will be described by referring to FIG. 43. In the case where the question (1)
of this figure is inputted, possible answers are illustrated as (a) to (c) in this figure. Here, the pre-CA
sets the expectation from the context (in this case, a question in a colloquy) so that even an incomplete statement,
which would not be easily parsed by an ordinary analysis depending upon the construction thereof, can be
analyzed. In this case, the pre-CA of the CA of this embodiment sets "the purpose of a visit" as the expectation.
Therefore, the answer (a) of this figure can be analyzed.

Further, in the case of inputting the information (2) of this figure, the pre-CA and the main CA perform the
processing similarly as in the foregoing cases. The post-CA, however, analyzes the structure of the concept
and performs the deletion or the modification thereof to thereby output only necessary information. In the case
of this illustrative statement, an output (d) of this figure, which is equivalent to the sentence "I want to write to
John", is obtained.

In the case where the statement (1) of FIG. 44 is inputted as a further example, draft instances are created
correspondingly to the two meanings of the word "visit", and the choice between them depends upon the subsequent
information, namely, upon the meaning of the statement. For instance, in the case where the continued input (2)
is the expression (c) of this figure, the meaning of the word "visit" is inferred to be (a) of this figure from the
fact that the person (namely, the listener) is "you". Thus the ambiguity in the statement is eliminated. In the
case where the continued input (2) is the expression (d) of this figure, the meaning of the word "visit" is inferred
to be (b) of this figure from the fact that the "Human Computer Interaction labs" is a building. Thus the ambiguity
in the statement is similarly eliminated.

Next, the case of processing Japanese family register information will be described hereinafter as still
another example of processing natural language information described or represented in a predetermined
format. The Japanese family register information is not described in the form of a continuous text but is formed
by enumerating predetermined items in a predetermined order.

FIG. 45 is a diagram for illustrating the knowledge structure of the place stored in the WKB 51.
FIG. 46 is a diagram for illustrating the relation among the country, the prefecture, the city and so on as 
a hierarchical knowledge structure. 

FIG. 47 is a diagram for illustrating the general knowledge structure of the place. This structure has three
slots S1, S2 and S3. The slot S1 represents <NAME>, in which the name of the place is described. The slot
S2 represents <Owns Places>, wherein the places owned by this place are listed. The slot S3 represents
<Belongs to Places>, wherein the place to which this place belongs is specified.

FIG. 48 is a diagram for illustrating the knowledge structure in the case where the place of the foregoing
description of FIG. 47 is a city. In the slot S1, the name of the city is described. If the city has one or more
wards (namely, "ku" in Japanese), all of the names of the wards are described in the slot S2. If the city
has no ward but has one or more towns (namely, "cho" or "machi" in Japanese), each of which has a rank lower
than the rank of the city by one level, all of the names of the towns are described in the slot S2. In the slot S3,
the name of the state/prefecture (namely, "ken" in Japanese) to which the city belongs is specified.

FIG. 49 is a diagram for illustrating a practical example of the knowledge structure of the city of FIG. 48,
namely, the knowledge structure of YOKOHAMA city.
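
The three-slot place structure and its instantiation for a city can be sketched as follows. The class and field names are hypothetical, and the slot values are illustrative geographic facts rather than the contents of FIG. 49, which are not reproduced in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Place:
    name: str                                             # slot S1: <NAME>
    owns_places: List[str] = field(default_factory=list)  # slot S2: <Owns Places>
    belongs_to: Optional[str] = None                      # slot S3: <Belongs to Places>


# A small registry forming a country / prefecture / city hierarchy.
REGISTRY: Dict[str, Place] = {
    "JAPAN": Place("JAPAN", owns_places=["Kanagawa"]),
    "Kanagawa": Place("Kanagawa", owns_places=["YOKOHAMA"], belongs_to="JAPAN"),
    "YOKOHAMA": Place("YOKOHAMA", owns_places=["Tsurumi-ku"], belongs_to="Kanagawa"),
}


def ancestors(name: str, registry: Dict[str, Place]) -> List[str]:
    """Follow slot S3 upward and list every place that contains `name`."""
    chain: List[str] = []
    current = registry.get(name)
    while current is not None and current.belongs_to is not None:
        chain.append(current.belongs_to)
        current = registry.get(current.belongs_to)
    return chain
```

Following the slot S3 links in this way recovers the hierarchical relation among the city, the prefecture and the country from the individual place structures.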

FIG. 50 is a diagram for illustrating the general knowledge structure of an address. As shown in this figure,
there are seven slots S1 to S7, which correspond to the country, the prefecture, the city, the "ku" (namely,
the ward), the "cho" (namely, the town), the "chome" and the "banchi" (namely, the lot No.), respectively.

FIG. 51 is a diagram for illustrating the hierarchical knowledge structure of the places in JAPAN.

FIG. 52 is a diagram for illustrating the knowledge structure of the "person" as used for analyzing the family 
register. 

FIG. 53 is a diagram for illustrating the knowledge structure of the "action". The slot S1 corresponds to
the agent of the action; the slot S2 to the Object (namely, the direct object) of the action; the slot S3 to the
Iobject (namely, the indirect object) of the action; the slot S4 to an action serving as the reason for the indirect
object; the slot S5 to a place from which the action is performed; the slot S6 to an action serving as the
instrument; the slot S7 to a place serving as the support; and the slot S8 to the time.

In the foregoing description, the knowledge structures stored in the WKB have been described. Next, the
knowledge structures depending upon the field of information to be processed will be described hereinbelow.
Hereunder, the knowledge for processing a family register will be described.

FIG. 54 is a diagram for illustrating the knowledge structure of a family register. The slot S1 corresponds
to the legal domicile; the slot S2 to the holder of the family register; the slot S3 to the action; and the slot S4
(described on the bottom line) to the page number. In the case of this slot S4, <PAGE> is put in braces {}. This
means that the page is an optional slot.

FIG. 55 is a diagram for illustrating the knowledge structure of the page of the family register. FIG. 56 is
a diagram for illustrating the knowledge structure of the block of the family register. The slot S1 corresponds
to the owner of this block; the slot S2 to the declaration; the slot S3 to the relation between a person and the
owner of the family register; the slot S4 to the rank of a female or male child among the children of the same
sex of the owner of the family register; and the slot S5 to the distinction of sex. Additionally, if a child is an
adopted one, such a fact is described in the slot S6.

FIG. 57 is a diagram for illustrating the contents of a general dictionary. The concepts representing the 
information listed on the left side thereof are described on the right side thereof, as shown in this figure. 

FIG. 58 is a diagram for illustrating the contents of a dictionary of a specific field, in this case, the contents
of the dictionary pertaining to the field of the family register.

FIG. 59 is a diagram for illustrating the examples of rules for describing general knowledge information to 
be stored in the WKB. 

FIG. 60 is a diagram for illustrating rules for description of the knowledge specific to the field of the family
register to be stored in the DKB.

Next, a procedure of processing family register information in this system embodying the present invention 
by using a practical example of input information will be described hereinbelow as an example of the procedure 
performed by this system. 

FIG. 61 is a diagram for illustrating an example of the read family register information. Here, it is assumed
that the information of this figure is inputted to the system by means of an optical character reader (OCR) and
that, before the processing of the text written in an upper left portion of this family register, the information
representing the legal domicile and the owner (namely, the householder) written in a right-side portion of this
family register and the information representing the names of the father and mother of each member of this family,
which are written in the bottom portion of this register, have been preliminarily processed.

Further, it is supposed that the sentence meaning "The undermentioned person was born at Chuo-ku, Metropolis
of Tokyo, on June 20, 1944 (19th year of the Showa era). His birth was reported by his father and his name was
entered into the family register of his father on the 25th of the same month" is first inputted.

In this case, the empty instances of the MTRANS1 and the EVENT1 of FIG. 62 are created in the context
setting step 211 from the knowledge of the slot S2 of FIG. 56, namely, from the knowledge that the entire statement
of this block of the family register corresponds to the action meaning "declaration", that is, the MTRANS
corresponding to an EVENT which has already occurred.

Further, the owner of the block is set as the actor of the EVENT, and the Time-Phrase of the EVENT1 is
expected in accordance with the rules for describing the information in the family register.

Next, the CA processes the aforementioned input information word by word. On referring to the information
in the dictionary of FIG. 57, the concept for the characters meaning "Heisei" is found to be the <Era Name>, which
matches the expectation for the Time-Phrase. Thus, this concept <Era = Heisei> is filled in the slot S1 of the time2
of FIG. 62 by performing the processing in step S511 of FIG. 5. Next, the character meaning "first" is set as
<NUMBER = 1> from the dictionary of FIG. 57 and is further made to fill the slot <Year (= 1)> in the slot S1 of the
time2 of FIG. 62, after ensuring that the next character, meaning "year", corresponds to the <Year Mark>; this
character is itself disregarded as a delimiter. Here, a delimiter is utilized as a mark for separating individual
items in continuous input information and for indicating what the preceding and subsequent information represent.
Similarly, the concept for the character meaning "1" is filled in the slot S2 of the time2, considering that the
expectation is for a month and that it is followed by the character meaning "month", which is the <Month Mark>.
Further, the character meaning "month" is itself disregarded as the <Month Mark>. Similarly, the concept for the
characters meaning "27th" is filled in the slot S3 of the time2 and the character meaning "day" is disregarded as
the <Day Mark>.
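
The use of delimiters in the Time-Phrase can be sketched as follows. This is a minimal sketch under stated assumptions: the input is taken as a list of already separated tokens, and the English tokens "Heisei", "year", "month" and "day" stand in for the Japanese characters; the flat result dictionary is also hypothetical and does not reproduce the slots of FIG. 62.

```python
from typing import Dict, List, Optional, Union

ERA_NAMES = {"Heisei"}                # stand-in for the <Era Name> entries
TIME_MARKS = {"year", "month", "day"}  # stand-ins for the <Year/Month/Day Mark>


def parse_time_phrase(tokens: List[str]) -> Dict[str, Union[str, int]]:
    """Fill time slots from a token sequence, using marks as delimiters.

    A number is held back until the following mark says which slot it
    belongs to; the mark itself is then discarded. Parsing stops at the
    first token that cannot belong to a Time-Phrase.
    """
    time: Dict[str, Union[str, int]] = {}
    pending: Optional[int] = None
    for token in tokens:
        if token in ERA_NAMES:
            time["era"] = token
        elif token.isdigit():
            pending = int(token)
        elif token in TIME_MARKS:
            if pending is None:
                raise ValueError(f"mark {token!r} is not preceded by a number")
            time[token] = pending      # the mark decides the slot
            pending = None             # the mark itself is disregarded
        else:
            break                      # end of the Time-Phrase
    return time
```

A token such as a city name ends the loop, which corresponds to the point at which the expectation is updated from the Time-Phrase to the next kind of phrase.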

Next, the concept corresponding to the characters meaning "Yokohama" is obtained. This corresponds to the <City
Mark>. Thus it is inferred from the rule of FIG. 60 that this indicates the beginning of the Address-Phrase. Then,
the expectation is updated to be for the "address" and the concept for the characters meaning "Yokohama" is filled
in the slot S3 of the address2. The next character, meaning "city", is disregarded as the <City Mark>. Similarly,
the concept for the characters meaning "Tsurumi" is filled in the slot S4 of the address2 and the character meaning
"ku" (ward) is disregarded as the <Ku Mark>.

The next character, meaning "at", is the casemarker. Thus, it is inferred from the rules given in FIG. 60 that
this indicates the end of the Address-Phrase. Then, the expectation is updated to be for the word meaning "birth",
which is a kind of the declaration.

The next word, meaning "birth", denotes the concept <Birth> from the dictionary given in FIG. 58. Therefore,
it is inferred from the rule of FIG. 60 that the type of the "Event 1", which is the Object of the declaration
(MTRANS), is BIRTHCD. Then, the expectation is updated to be for the "Time-Phrase".

The next character, meaning "same", corresponds to the <Pronoun1>. Therefore, the same information as stored
in the slot S2 of the time2 is filled in the slot S2 (Month) of the time1 after processing the next character,
meaning "month", which is the <Month Mark>. Further, the information contained in the slot S1 of the time2 is copied
onto the slot S1 (Era, Year) of the time1, which represents a higher-order part of the time information. As before,
the character meaning "month" is disregarded. Further, the concept corresponding to the characters meaning "27th"
is filled in the slot S3 of the time1 and the character meaning "day" is disregarded as the <Day Mark>.

The next word, meaning "father", corresponds to the <Person1>. It is inferred from this that the Time-Phrase
has terminated and the expression representing the concept <Person> has begun. Here, note that the slot
information regarding the person represented by the word meaning "father" can be obtained from the block information.

The next word, meaning "declaration", corresponds to the concept <Declaration> in the dictionary of
FIG. 58. Further, it is inferred from the rule of FIG. 60 that the action represented by the MTRANS is performed
by the person represented by the <Person1>. Therefore, <Person1> is filled in the slot S1 of the MTRANS1.
Further, the expectation is updated according to the rule of FIG. 60 to be the Time-Phrase of the information
{Diff Birthplace}.

The next expression, meaning "Entry in Family Register", denotes the concept <Entry in Family Register>
in the dictionary given in FIG. 58. Here, the Time-Phrase is not found. Thus it is inferred from the rule of FIG.
60 that the rule "Diff Birth-place" is not applicable in this case and that the analysis of this statement is
successful. Thereby, the address represented by the <Address1> is assumed to be the same as the legal domicile
of the holder of this family register. Thus the information representing the domicile is copied onto
the slot <Address1>. Consequently, the slot <Address1> holds the address meaning "Tsurumi-ku, Yokohama City".

FIG. 63 is a diagram for illustrating the result of the processing described just hereinabove. 

Although the preferred embodiment of the present invention has been described above, it should be un- 
derstood that the present invention is not limited thereto and that other modifications will be apparent to those 
skilled in the art without departing from the spirit of the invention. 

The scope of the present invention, therefore, is to be determined solely by the appended claims. 



Claims

1. A natural language processing system comprising: input means for inputting information represented in 
a natural language; 

a knowledge base for storing linguistic knowledge and general knowledge; 

partition means for partitioning the information, which is inputted by said input means, into words; 
derivation means for referring to knowledge stored in said knowledge base and for deriving con- 
cepts respectively represented by the words obtained by said partition means; and 

integration means for relating the concepts of the words, which are derived by the derivation means, 
with one another by referring to knowledge stored in said knowledge base. 

2. The natural language processing system according to claim 1, which further comprises extraction means for
extracting specific data from the concepts derived by said derivation means.

3. The natural language processing system according to claim 2, wherein the information represents the
contents of a document of a specific domain, wherein said knowledge base has knowledge of the specific
domain, and wherein said extraction means refers to the knowledge of the specific domain stored in said
knowledge base and extracts specific data.

4. A natural language processing system comprising: 

input means for inputting information represented in a natural language;

a knowledge base for storing therein linguistic knowledge and knowledge concerning a domain of
information to be processed;

expectation means for expecting a kind of information to be inputted from said input means, on 
the basis of the knowledge stored in said knowledge base; and 

processing means for processing information whose kind is expected by said expectation means. 


5. The natural language processing system according to claim 4, wherein said expectation means has dis- 
crimination means for discriminating a kind of information which can be inputted, before information is 
inputted from said input means. 

6. The natural language processing system according to claim 4, which further comprises preprocessing
means for performing a preprocessing corresponding to the kind of information, which is expected by said
expectation means.

7. The natural language processing system according to claim 6, wherein said preprocessing means pre- 
pares a knowledge structure corresponding to the kind of information, which is expected by said expec- 
tation means. 

8. The natural language processing system according to claim 4, which further comprises:

expectation information storing means for storing therein a result of an expectation made by said
expectation means; and

updating means for updating contents of information, which is stored in said expectation information
storing means, on the basis of said knowledge base.

9. The natural language processing system according to claim 4, wherein said knowledge base has knowl- 
edge concerning a domain of information to be processed. 

10. The natural language processing system according to claim 9, wherein said expectation means has dis- 
crimination means for discriminating a kind of information which can be inputted, before information is 
inputted from said input means. 

11. The natural language processing system according to claim 9, which further comprises preprocessing 
means for performing a preprocessing corresponding to the kind of information, which is expected by said 
expectation means. 

12. The natural language processing system according to claim 11, wherein the preprocessing means pre- 
pares a knowledge structure corresponding to the kind of information, which is expected by said expec- 
tation means. 

13. The natural language processing system according to claim 9, which further comprises:

expectation information storing means for storing therein a result of an expectation made by said
expectation means; and

updating means for updating contents of information stored in the expectation information storing
means, on the basis of said knowledge base.


14. A natural language processing system comprising: 

input means for inputting information represented in a natural language; 

a knowledge base for storing therein knowledge concerning a domain of information to be proc- 
essed, general knowledge and linguistic knowledge; 
expectation means for expecting a kind of information to be inputted by the input means, on the
basis of the knowledge stored in said knowledge base;

expectation information storing means for storing a result of an expectation made by said expectation
means, as expectation information; and

analysis means for analyzing information inputted from said input means by referring to the expectation
information stored in the expectation information storing means and to the knowledge stored in the
knowledge base.

15. The natural language processing system according to claim 14, wherein said expectation means expects
a kind of information subsequent to current information on the basis of a result of an analysis of the current
information, which is made by said analysis means, and the knowledge stored in said knowledge base.

16. The natural language processing system according to claim 15, which further comprises updating means
for updating contents of information, which is stored in said expectation information storing means, on
the basis of a result of an expectation made by said expectation means.

17. The natural language processing system according to claim 14, which further comprises extraction means for
extracting specific data from the information inputted from said input means, on the basis of a result of an
analysis made by said analysis means.

18. The natural language processing system according to claim 17, wherein the information represents the
contents of a document of a specific domain, wherein said knowledge base has knowledge of the specific
domain, and wherein said extraction means refers to the knowledge of the specific domain stored in said
knowledge base and extracts specific data.

19. The natural language processing system according to claim 14, wherein said input means inputs a set of
sentences, wherein said analysis means processes the set of sentences sentence by sentence and
outputs concepts as a result of an analysis.

20. The natural language processing system according to claim 19, which further comprises identification
means for identifying a beautification concept among the concepts outputted by the analysis means.

21. The natural language processing system according to claim 19, which further comprises connection
means for connecting associated concepts with one another among a plurality of concepts outputted by
the analysis means.

22. The natural language processing system according to claim 19, which further comprises separation 
means for separating concepts, which are unsuitably connected with one another, from among concepts 
outputted by the analysis means. 

23. A natural language processing method comprising: 

the input step of inputting information in a natural language; 

the partition step of partitioning the information inputted in the input step; 

the derivation step of referring to knowledge stored in a knowledge base, which stores linguistic 
knowledge and general knowledge, and deriving concepts represented by the words obtained in the par- 
tition step; and 

the integration step of relating the concepts, which are derived respectively correspondingly to the 
words in the derivation step, with one another by referring to the knowledge stored in the knowledge base. 

24. The natural language processing method according to claim 23, which further comprises the extraction
step of extracting specific data from the concepts derived in the derivation step.

25. The natural language processing method according to claim 24, wherein the information represents the
contents of a document of a specific domain, wherein the knowledge base has knowledge of the specific
domain, and wherein in the extraction step specific data is extracted by referring to the knowledge of the
specific domain stored in the knowledge base.

26. A natural language processing method comprising: 

the input step of inputting information in a natural language; 

the expectation step of expecting the kind of information inputted in the input step on the basis of
knowledge stored in the knowledge base, which stores therein linguistic knowledge and knowledge
concerning the field of information to be processed; and

the step of processing the information whose kind is expected in the expectation step. 

27. The natural language processing method according to claim 26, wherein the expectation step has the
discrimination step of discriminating a kind of information, which can be inputted, before information is
inputted in the input step.

28. The natural language processing method according to claim 26, which further comprises the preprocessing
step of performing a preprocessing corresponding to the kind of information, which is expected in the
expectation step.

29. The natural language processing method according to claim 28, wherein in the preprocessing step, a
knowledge structure corresponding to the kind of information, which is expected in the expectation step,
is prepared.

30. The natural language processing method according to claim 26, which further comprises: 

the expectation information storing step of storing a result of an expectation made in the expectation 
step, in an expectation information memory; and 

the updating step of updating contents of information, which is stored in the expectation information
memory, on the basis of the knowledge base. 

31. The natural language processing method according to claim 26, wherein the knowledge base has knowledge
concerning a domain of information to be processed.

32. The natural language processing method according to claim 31, wherein the expectation step has the discrimination
step of discriminating a kind of information, which can be inputted, before information is inputted
in the input step.

33. The natural language processing method according to claim 31, which further comprises the preprocessing
step of performing a preprocessing corresponding to the kind of information, which is expected in the
expectation step.

34. The natural language processing method according to claim 33, wherein in the preprocessing step, a 
knowledge structure corresponding to the kind of information, which is expected in the expectation step, 
is prepared. 

35. The natural language processing method according to claim 31, which further comprises: 

the expectation information storing step of storing a result of an expectation made in the expectation 
step, in an expectation information memory; and 

the updating step of updating contents of information stored in the expectation information memory,
on the basis of the knowledge base.

36. A natural language processing method comprising: 

the input step of inputting information in a natural language; 

the expectation step of expecting the kind of information inputted in the input step on the basis of
knowledge stored in the knowledge base, which stores therein knowledge concerning a domain of information
to be processed, general knowledge and linguistic knowledge;

the expectation information storing step of storing information, which represents a result of an expectation
made in the expectation step, in an expectation information memory as expectation information;
and

the analysis step of analyzing the information inputted in the input step by referring to the expectation
information stored in the expectation information memory and the knowledge stored in the knowledge
base.

37. The natural language processing method according to claim 36, wherein in the expectation step, a kind 
of information subsequent to current information is expected on the basis of a result of an analysis of the 
current information, which is made in the analysis step, and the knowledge stored in the knowledge base. 

38. The natural language processing system according to claim 37, which further comprises the updating step 
of updating contents of information, which is stored in the expectation information memory, on the basis
of a result of an expectation made in the expectation step. 

39. The natural language processing method according to claim 36, which further comprises the extraction step of extracting
specific data from the information, which is inputted in the input step, on the basis of a result of an
analysis made in the analysis step.

40. The natural language processing method according to claim 39, wherein the information represents the contents
of a document of a specific domain, wherein the knowledge base has knowledge of the specific domain, 
and wherein in the extraction step, the knowledge of the specific domain stored in the knowledge base 
is referred to and specific data is extracted. 

41. The natural language processing system according to claim 36, wherein in the input step, a plurality of 
sentences are inputted, wherein in the analysis step, a set of sentences are processed as sentence by 
sentence and concepts are outputted as a result of an analysis. 

42. The natural language processing system according to claim 41, which further comprises the identification
step of identifying a beautification concept among the concepts outputted by the analysis means.

43. The natural language processing system according to claim 41, which further comprises the connection 
step of connecting associated concepts with one another among a plurality of concepts outputted in the 
analysis step. 

44. The natural language processing system according to claim 41, which further comprises the separation
step of separating concepts, which are unsuitably connected with one another, from among concepts outputted
by the analysis means.

45. A system or method having the features of any combination of the preceding claims. 
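
By way of illustration only, the expectation step, the expectation information memory and the analysis step recited in claims 36 and 37 may be sketched in Python as follows. The lexicon entries for "book" follow FIG. 41(a); the table of expected kinds, the class name and the function names are assumptions made for this sketch and do not appear in the embodiment.

```python
from dataclasses import dataclass, field

# Hypothetical knowledge base: each word maps to (kind, concept) readings.
LEXICON = {
    "John": [("PERSON", "PERSON")],
    "Mary": [("PERSON", "PERSON")],
    "sent": [("ACTION", "PTRANS")],
    "book": [("ACTION", "AGREEMENT"), ("OBJECT", "PUBLIC-DOCUMENT")],
    "to": [("CASE-MARKER", "CASE-MARKER")],
}

# Hypothetical word-order knowledge: kinds expected after each concept.
NEXT_KINDS = {
    None: {"PERSON"},  # at the start of a sentence a subject is expected
    "PERSON": {"ACTION", "CASE-MARKER"},
    "PTRANS": {"OBJECT", "PERSON"},
    "PUBLIC-DOCUMENT": {"CASE-MARKER"},
    "CASE-MARKER": {"PERSON"},
}


@dataclass
class ExpectationMemory:
    """Holds the kinds of word currently expected."""
    kinds: set = field(default_factory=lambda: set(NEXT_KINDS[None]))


def expect(concept):
    """Expectation step: kinds of word that may follow the given concept."""
    return set(NEXT_KINDS.get(concept, ()))


def analyze(words):
    """Analysis step: process word by word, preferring expected readings."""
    memory = ExpectationMemory()
    concepts = []
    for word in words:
        readings = LEXICON[word]
        expected = [c for kind, c in readings if kind in memory.kinds]
        # Fall back to the first reading when nothing was expected.
        concept = expected[0] if expected else readings[0][1]
        concepts.append(concept)
        memory.kinds = expect(concept)  # store the new expectation
    return concepts


sentence_concepts = analyze("John sent book to Mary".split())
```

In this sketch the word "book" is read as PUBLIC-DOCUMENT after "sent", because an object is expected there, and as AGREEMENT directly after a person, because an action is expected there.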
PTRANS



FIG. 34(a)



SLOTNAME 

Actor 

Object 

Instrument 

Iobject - Beneficiary

- Reason 

- Direction 

- Via 

From 

Support 

Time 

Connected To 

Tense/Modal 

Qualifier 

o o o 



PERSON 

ACTION/VEHICLE 

PERSON 

ACTION 

PERSON/PLACE 

PERSON/PLACE 

PERSON/PLACE 

PLACE/ORGANIZATION 

TIME 



PTRANS 



Draft Instance of PTRANS 

FIG. 34(b) 
FIG. 38
( END )
FIG. 39

a) INPUT    : "I would like to meet you"

   OUTPUT   : WANT { Actor  : PERSON (Aruna Rohra)
                     Object : MEET {
                                Actor  : PERSON (Aruna Rohra)
                                Object : PERSON (John Smith)
                                Tense  : Present form }
                     Tense  : Future form
                   }

b) INPUT    : "I am coming to USA in April. (1)
               Hence I would like to meet you after 10th" (2)

   OUTPUT   : PTRANS {...} (for (1))
              WANT {...
                    Object : MEET {...}} (for (2))

   OUTPUT.P : WANT {...
                    Object : MEET {...
                               Connected.to : (PTRANS {...}, REASON)}}

c) INPUT    : "reserve accommodation and pick up"

   OUTPUT   : AGREEMENT {Actor :
                         Connected.to : (MEET {...}, AND.REL)}

   OUTPUT.P : AGREEMENT {...}
              MEET {...}
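
By way of illustration only, the connection shown for OUTPUT.P in FIG. 39(b) may be sketched in Python as follows. The use of the sentence-initial word "Hence" as the cue for a REASON relation, the cue table and the function name are assumptions made for this sketch; the figure shows only the input and the connected result.

```python
# Concepts output for sentences (1) and (2) of FIG. 39(b); slot contents elided.
ptrans = {"type": "PTRANS"}
want = {"type": "WANT", "Object": {"type": "MEET"}}

# Hypothetical table of discourse cues and the relation each one signals.
CUE_RELATIONS = {"hence": "REASON"}


def connect(previous, current, sentence):
    """Link `previous` to the concept embedded in `current` when the
    sentence opens with a known cue word; otherwise leave it unchanged."""
    words = sentence.split()
    first_word = words[0].lower() if words else ""
    relation = CUE_RELATIONS.get(first_word)
    if relation is not None:
        current["Object"]["Connected.to"] = (previous, relation)
    return current


linked = connect(ptrans, want, "Hence I would like to meet you after 10th")

# A sentence without a cue word is left unconnected.
plain = connect(ptrans,
                {"type": "WANT", "Object": {"type": "MEET"}},
                "I would like to meet you")
```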
FIG. 40

(a) John sent book to Mary

(b) Subject (Agent-Person/Organization), Verb (Action/State Descriptor)

(c) John Noun PERSON (S1 (Name=John), S14 (Sex=Male))

(d) C1 -> PERSON (S1 (Name-John)
                  S14 (Sex-Male)
                 )

(e) Lc1 -> LingCon (Con-C1
                    PartofSpeech-Noun
                    ...)

(f) send Verb PTRANS (Object-OBJECT,
                      Instrument-Post or Vehicle...)

(g) C2 -> PTRANS (Actor-PERSON
                  Object-OBJECT
                  Iobj-benf-PERSON
                  ...)

(h) Lc2 -> LingCon (Con-C2
                    PartofSpeech-Verb
                    ...)

(i) Position Information1 -
        Actor(Person), Verb, Object(Object/Person), Iobj-benf(Person), Iobj-direc(Place)
    Position Information2 -
        Actor(Person), Verb, Iobj-benf(Person), Object(Object but not Person)
FIG. 41

(a) book Verb AGREEMENT (Object-Room or Accommodation or Ticket,
                         Iobj-benf-Person
                         ...)

    book Noun Singular PUBLIC-DOCUMENT

(b) C3 -> PUBLIC-DOCUMENT
    Lc3 -> LingCon (Con-C3
                    PartofSpeech-Noun
                    ...)

(c) to CASE-MARKER Preposition

(d) C4 -> CASE-MARKER

(e) Lc4 -> LingCon (Con-C4
                    PartofSpeech-PREPOSITION
                    ...)

(f) Mary Noun PERSON (Name-Mary, Sex-Female)

(g) C5 -> PERSON (Name-Mary,
                  Sex-Female,
                  ...)

(h) Lc5 -> LingCon (Con-C5
                    PartofSpeech-Noun
                    ...)
FIG. 42

C2-PTRANS (Actor C1
           Object C3
           Iobj-benf C5
           Iobj-Reason (ACTION)
           From (PLACE)
           Instrument (VEHICLE)
           Support (PLACE)
           Tense (past)
           ...)

where,

C1-PERSON (Name (John)
           Sex (Male)
           ...)

C3-PUBLIC-DOCUMENT (Count (Singular)...)

C5-PERSON (Name (Mary)
           Sex (Female)
           Caserole C4
           ...)

C4-CASEMARKER (role-BENEFICIARY)
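
By way of illustration only, the assignment of the concepts C1 to C5 to the slots of C2, using Position Information1 of FIG. 40(i), may be sketched in Python as follows. The function name, the tuple layout, the ".Caserole" key and the treatment of PUBLIC-DOCUMENT as a kind of OBJECT are assumptions made for this sketch.

```python
# Concepts in sentence order, named as in FIG. 40 to FIG. 42.
CONCEPTS = [
    ("C1", "PERSON", {"Name": "John", "Sex": "Male"}),
    ("C2", "PTRANS", {}),
    ("C3", "PUBLIC-DOCUMENT", {"Count": "Singular"}),
    ("C4", "CASE-MARKER", {"role": "BENEFICIARY"}),
    ("C5", "PERSON", {"Name": "Mary", "Sex": "Female"}),
]

# Position Information1 as (slot, accepted kinds), in word order.
# Assumption: a PUBLIC-DOCUMENT is accepted wherever an OBJECT is.
POSITION_INFO_1 = [
    ("Actor", {"PERSON"}),
    ("Verb", {"PTRANS"}),
    ("Object", {"PUBLIC-DOCUMENT", "PERSON"}),
    ("Iobj-benf", {"PERSON"}),
    ("Iobj-direc", {"PLACE"}),
]


def fill_slots(concepts, pattern):
    """Assign each concept to the next slot of the position pattern."""
    slots = {}
    pending_marker = None  # case marker waiting for the concept it marks
    position = 0
    for name, kind, _attrs in concepts:
        if kind == "CASE-MARKER":
            pending_marker = name
            continue
        if position >= len(pattern):
            raise ValueError(f"no slot left for {name}")
        slot, accepted = pattern[position]
        if kind not in accepted:
            raise ValueError(f"{name} ({kind}) cannot fill {slot}")
        slots[slot] = name
        if pending_marker is not None:
            # Record the case marker on the concept that follows it.
            slots[slot + ".Caserole"] = pending_marker
            pending_marker = None
        position += 1
    return slots


result = fill_slots(CONCEPTS, POSITION_INFO_1)
```

The trailing Iobj-direc slot of the pattern stays unfilled here because the sentence of FIG. 40(a) contains no place.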
FIG. 43(1)

'What is the objective/purpose of your visit ?'

(a) 'to discuss on NLP systems...'

(b) 'The purpose of my visit is to discuss on NLP systems...'

(c) 'I would like to visit him to discuss on NLP systems...'

FIG. 43(2)

I want to write to John.

oops!, it is not John but actually meant to write to Mary.

(d) WANT (Actor-<user>
          Object-MTRANS (Actor-<user>
                         Object-
                         Iobj Benf-PERSON (name-Mary...)
                         ...)
          ...)
FIG. 44(1)

Input : 

I would like to visit ... 



(a) MEET (i.e., visit someone either socially or on business or for some other purpose) 

(b) MSENSE (i.e., go or come to see a place, an institution etc.) 



FIG. 44(2) 

Continued Input : 

(c) you for a week after 10th 

(d) Human Computer Interaction labs at your university. 
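
By way of illustration only, the choice between the two readings of "visit" in FIG. 44(1) on the basis of the continued input of FIG. 44(2) may be sketched in Python as follows. The word-to-kind table and the function name are assumptions made for this sketch; the pairing of MEET with a person and MSENSE with a place follows the glosses (a) and (b).

```python
# Candidate concepts for "visit" and the kind of object each one expects.
VISIT_CANDIDATES = {"MEET": "PERSON", "MSENSE": "PLACE"}

# Hypothetical lookup of the kind denoted by a word of the continued input.
KIND_OF = {"you": "PERSON", "labs": "PLACE", "university": "PLACE"}


def resolve_visit(continued_input):
    """Return the reading of "visit" whose expected object kind matches
    the first recognizable word of the continued input, or None."""
    for word in continued_input.lower().replace(".", "").split():
        kind = KIND_OF.get(word)
        for concept, expected_kind in VISIT_CANDIDATES.items():
            if kind == expected_kind:
                return concept
    return None  # the input so far does not resolve the ambiguity


reading_c = resolve_visit("you for a week after 10th")
reading_d = resolve_visit("Human Computer Interaction labs at your university.")
```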
FIG. 54

FAMILY REGISTER :
    S1 <Place> which is Legal Domicile
    S2 <Person> who is the holder
    S3 <ACTION> which is the firstline
    S4 {<Page>}

FIG. 55

PAGE :
    S1 {<BLOCK>}
    S2 <NUMBER>
FIG. 57

TIME :
    <Era Name>
    (1-64/1-200) <Year Mark>
    (1-12) <Month Mark>
    (1-31) <Day Mark>
    <DAY PART Mark>

PLACE :
    <City Mark>
    <Metropolis Mark>
    <Ku Mark>

PERSON :
    <Relative>

NUMBERS :
    <NUMBER (=1)>
    <NUMBER (=2)>
    <NUMBER (=10)>
    <NUMBER (=1)>

OTHERS :
    <CASEMARKER>
    <Pronoun 1>
FIG. 58

<Birth>
<Declaration>
<Death>
<Entry in Family Register>
<Family Register>
<Removal from Family Register>
<Relative>
<Forwarding>
FIG.

<MTRANS1> :
    S1 <Person1> who is Agent
    S2 <EVENT1> which is Object
    S3 <Person> who is the Iobject
    S4 <Action> which is Iobj Reason
    S5 <Place> which is From
    S6 <Action> which is Instrument
    S7 <Address1> which is Support
    S8 <Time1> which is Time
<EVENT1> :
    S1 <Block Owner> who is Agent
    S2 <Concept> which is Object
    S3 <Person> who is the Iobject
    S4 <Action> which is Iobj Reason
    S5 <Place> which is From
    S6 <Action> which is Instrument
    S7 <Address2> which is Support
    S8 <Time2> which is Time

<ADDRESS1> :
    S1 <Country>
    S2 <Prefecture>
    S3 <City>
    S4 <Ku>
    S5 <Cho>
    S6 <Chome>
    S7 <Banchi>

<ADDRESS2> :
    S1 <Country>
    S2 <Prefecture>
    S3 <City>
    S4 <Ku>
    S5 <Cho>
    S6 <Chome>
    S7 <Banchi>

<TIME1> :
    S1 <Year Era>
    S2 <Month>
    S3 <Day>
    S4 <Day part>
    S5 <Hours>
    S6 <Minutes>

<TIME2> :
    S1 <Year Era>
    S2 <Month>
    S3 <Day>
    S4 <Day part>
    S5 <Hours>
    S6 <Minutes>
FIG. 63

<MTRANS1> :
    S1 Father who is Agent
    S2 <BIRTHCD1> which is Object
    S3 <Person> who is the Iobject
    S4 <Action> which is Iobj Reason
    S5 <Place> which is From
    S6 <Action> which is Instrument
    S7 <Address1> which is Support
    S8 <Time1> which is Time

<BIRTHCD1> :
    S1 <Block Owner> who is Agent
    S2 <Concept> which is Object
    S3 <Person> who is the Iobject
    S4 <Action> which is Iobj Reason
    S5 <Place> which is From
    S6 <Action> which is Instrument
    S7 <Address2> which is Support
    S8 <Time2> which is Time

<ADDRESS1> :
    S1 <Country>
    S2 <Prefecture>
    S3 Yokohama City
    S4 Tsurumi-Ku
    S5 <Cho>
    S6 <Chome>
    S7 <Banchi>

<ADDRESS2> :
    S1 <Country>
    S2 <Prefecture>
    S3 Yokohama City
    S4 Tsurumi-Ku
    S5 <Cho>
    S6 <Chome>
    S7 <Banchi>

<TIME1> :
    S1 [Year=1 Era=Heisei]
    S2 [Month=1]
    S3 [Day=29]
    S4 <Day part>
    S5 <Hours>
    S6 <Minutes>

<TIME2> :
    S1 [Year=1 Era=Heisei]
    S2 [Month=1]
    S3 [Day=27]
    S4 <Day part>
    S5 <Hours>
    S6 <Minutes>
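
By way of illustration only, a partly filled template such as the TIME1 and TIME2 structures of FIG. 63 may be sketched in Python as follows. The class name, the field names, the splitting of the "Year Era" slot into two fields and the helper function are assumptions made for this sketch; the values are read from the figure, taking the year and month entries as 1.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Time:
    """One TIME template; a slot left as None is still unfilled."""
    era: Optional[str] = None
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    day_part: Optional[str] = None
    hours: Optional[int] = None
    minutes: Optional[int] = None


def unfilled(template):
    """List the names of the slots that have not yet been filled."""
    return [f.name for f in fields(template)
            if getattr(template, f.name) is None]


# Values as shown for TIME1 and TIME2 in FIG. 63.
time1 = Time(era="Heisei", year=1, month=1, day=29)
time2 = Time(era="Heisei", year=1, month=1, day=27)

remaining = unfilled(time1)
```

Keeping unfilled slots explicit in this way lets a later processing step see which kinds of information are still expected for the template.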